(57) Abstract

The present invention provides a method and system for dynamically searching databases in response to a query, and more specifically 
a system and method for dynamic data-mining and on-line communication of customized information. This method includes the step of first 
creating a search-specific profile (15). This search-specific profile is then input into a data-mining search engine (100). The data-mining
search engine will mine the search-specific profile to determine topics of interests. These topics of interest are output to at least one search 
tool (16). These search tools (16) match the topics of interest to at least one destination data site wherein the destination data sites are 
evaluated to determine if relevant information is present in the destination data site. Relevant information is filtered and presented to the 
user (10) making the inquiry.
METHOD AND SYSTEM FOR DYNAMIC DATA-MINING AND ON-LINE
COMMUNICATION OF CUSTOMIZED INFORMATION 

RELATED APPLICATIONS 

This application claims benefit of U.S. Provisional 
Application No. 60/095,308 filed on August 4, 1998.
Additionally this application incorporates by reference the 
prior U.S. Provisional Application No. 60/095,308 filed on 
August 4, 1998 entitled "Method and System for Dynamic 
Data-mining and On-line Communication of Customized 
Information" to Ingrid Vanderveldt and U.S. Patent 
Application No. 09/282,392 filed on March 31, 1999 entitled 
"An Improved Method and System for Training an Artificial 
Neural Network" to Christopher L. Black. 

TECHNICAL FIELD OF THE INVENTION

This invention relates generally to the use of a 
dynamic search engine and, more particularly, to a dynamic 
search engine applied to the Internet that allows for 
customized queries and relevant responses. 

BACKGROUND OF THE INVENTION 

Current Internet search tools often provide irrelevant 
data sites or web sites. Often, current search tools 
provide a score of relevance according to text frequency 
within a given data site or web page. For example, consider
a query for "termites" and "Tasmania" and "not apples":

• If a web page has several instances of the word
"termites" (600 for example), the web page would
receive a high relevance score.

• A web page with 600 "termites" and one "Tasmania"
would receive a slightly higher score.

• A web page with the above plus "apples" would then
receive a slightly lesser score.
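
The frequency-based scoring in the three cases above can be sketched as follows. This is only an illustration, not the scoring used by any particular search tool: the function name `frequency_score` and the rule of adding one point per required-term hit and subtracting one per excluded-term hit are assumptions.

```python
import re

def frequency_score(text, required, excluded):
    """Score a page by raw term counts (assumed rule, for illustration only).

    Each occurrence of a required term adds 1; each occurrence of an
    excluded term subtracts 1.
    """
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(words.count(term) for term in required)
    score -= sum(words.count(term) for term in excluded)
    return score

# The three pages from the example above.
page_a = "termites " * 600
page_b = page_a + "tasmania"
page_c = page_b + " apples"

required, excluded = ["termites", "tasmania"], ["apples"]
scores = [frequency_score(p, required, excluded) for p in (page_a, page_b, page_c)]
```

Under these assumptions the second page scores one point above the first, and the third falls back by one point, although none of the three pages need be useful to the person asking about termites in Tasmania.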

Therefore, the relevance score assigned to a data site or
web page is often based on text or word frequency, and
current search tools often provide a list of irrelevant web
pages. Furthermore, there is an opportunity for abuse of
the method used by the available search tools. Current
search tools also often provide links that are stale (old
data that is no longer at the address of the data site).
Existing search tools utilize indices that are compiled in
the background continuously. However, with respect to an
individual query, a historical result is received.
Therefore, the search process involves a large amount of
filtering by the individual user.

Therefore, there is a need to more efficiently utilize 
search tools to overcome irrelevant results. At present, 
it is desirable to have an efficient method for performing 
a search which would take into account demographic as well 
as historical user information to filter irrelevant data 
from the results from existing search tools. 

Furthermore, it is desirable to have a search engine 
which will evaluate and filter stale data responses from an 
existing search tool response. 
SUMMARY OF THE INVENTION

In accordance with the present invention, a method and 
system for searching databases in response to a query is 
provided that substantially eliminates or reduces 
disadvantages and problems associated with previous methods 
and systems for searching databases. 

More specifically, the present invention provides a 
system and method for dynamic data-mining and on-line 
communication of customized information. This method
includes the steps of first creating a search-specific
profile. This search-specific profile is then inputted
into a data-mining search engine. The data-mining search
engine will mine the search-specific profile to determine
at least one topic of interest. The at least one topic of
interest may comprise specific and/or related topics of
interest. The at least one topic of interest is outputted
to at least one search tool. These search tools match the 
at least one topic of interest to at least one destination 
data site. The destination data sites (web page) are 
evaluated to determine if relevant information is present 
in the destination data site. If relevant information is 
present at the destination data site, this data site may be 
presented to a user. 

One broad aspect of the present invention includes the 
coupling of a data-mining search engine to at least one 
search tool. This data-mining search engine can review and 
evaluate data sites. Current search tools available may 
create a massive index of potential data sites. The
data-mining engine of the present invention evaluates whether
data accumulated by current search tools are relevant to a
user and filters out non-relevant information.

The present invention provides an advantage by 
providing a search engine algorithm that provides fresh (as 
opposed to stale) links to more highly relevant web pages 
(data sites) than provided by the current search engines. 
BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present
invention and the advantages thereof, reference is now made
to the following description taken in conjunction with the
accompanying drawings in which like reference numerals
indicate like features and wherein:

FIGURE 1 shows a diagram of the present embodiment of
the invention;

FIGURE 2 illustrates an example of operating the
present invention;

FIGURE 3 explains the related patent applications to
the present invention;

FIGURE 4 depicts the use of a training scheme
according to the teachings of BLACK; and

FIGURE 5 details a flow chart illustrating the method
of the present invention.
DETAILED DESCRIPTION OF THE INVENTION

Preferred embodiments of the present invention are
illustrated in the FIGUREs, like numerals being used to
refer to like and corresponding parts of the various
drawings.

In accordance with the present invention, a method and
system for dynamically searching databases in response to a
query is provided that substantially eliminates or reduces
disadvantages and problems associated with previous methods
and systems for searching databases.

More specifically, the present invention provides a
system and method for dynamic data-mining and on-line
communication of customized information. This method
includes the steps of first creating a search-specific
profile. This search-specific profile is then input
into a data-mining search engine. The data-mining search
engine will mine the search-specific profile to determine
at least one topic of interest. The at least one topic of
interest may comprise specific and/or related topics of
interest. The topic of interest is output to at least
one search tool. These search tools match the topic of
interest to at least one destination data site. The
destination data sites are evaluated to determine if
relevant information is present in the destination data
site. If relevant information is present, this data site
is assigned a relevance score and presented to the user
requesting the query.

One broad aspect of the present invention includes the
coupling of a data-mining search engine to at least one
search tool. This data-mining search engine reviews and
evaluates available data and data sites. Current search
tools available may create a massive index of potential
data sites. The data-mining engine of the present
invention evaluates whether the available data accumulated
by current search tools are relevant to a user and filters
out all non-relevant information, creating a more effective
and efficient search engine.

In one embodiment, the present invention includes a
web site containing several data-mining tools. These tools
fall into two separate categories: a dynamic approach to
generating a list of links that are well correlated to a
user provided search string using a novel search strategy
(e.g., incorporating simple text matching, text
associations, synonym and near text matching to handle
misspellings, profile information, a recursive definition
of document importance/relevance in which
important/relevant documents link to other
important/relevant documents, and weighting of the previous
factors based upon AI), and stand-alone models (e.g.,
neural networks and NSET models, as well as others known to
those skilled in the art), which would provide useful
predictions or estimations (such as described in the U.S.
Patent Application No. 09/282,392 entitled "An Improved
Method and System for Training An Artificial Neural
Network" filed 31 March 1999 to Christopher L. Black,
hereafter BLACK).

The stand-alone models would be created with
implementer or user interaction, and could ever increase
in number, as desired and as data was
discovered/licensed/acquired. Eventually, the web site
would contain a portal to hundreds of thousands of
interesting and useful models.

Neither the search engine nor the models would
necessarily be limited to medical information and topics.
However, the present invention primarily focuses on
healthcare-related applications. The system and method of
the present invention need not be limited to such health
care database.

The present invention provides a method for
data-mining that provides use of many different AI models
derived for many different applications from many different
datasets. The present invention provides the benefit of
neural network training algorithm, genetic algorithms,
expert and fuzzy logic systems, decision trees, and other
methods known to those skilled in the art applied to any
available data.

Secondly, the present invention allows the compact
storage, retrieval, and use of relationships and patterns
present in many datasets, each made up of very many
patterns or examples, each made of several different
measurements or values, each requiring several bytes when
stored conventionally or explicitly (as in a
database or a flat file). Single datasets consisting of
multiple gigabytes and terabytes of data are routinely
being generated, with exabyte datasets looming on the
horizon. With the use of multiple modeling techniques
(different approaches are appropriate to different
applications), models encapsulating and summarizing useful
information contained within hundreds or even thousands of
these datasets could be stored on a single consumer level
personal computer hard drive.
FIGURE 1 illustrates one physical implementation of
the present invention. The number of servers,
interconnections, software modules, and the like would
largely be determined by scalability concerns. The web
site 12 would consist of a graphical user interface (GUI)
to present dynamically generated indexes and forms that
allow the user 10 to provide a search profile and submit
their search requests or feed inputs into a selected AI
model. The web site 12 could reside upon a single web
server machine or on a standard farm of web server
machines. Search engine requests 15 would be provided to a
single search machine or a farm of search machines 16,
which would query static public or proprietary databases
18/indices of links either pre-created (and continually
updated) or licensed from, for example, Yahoo and other
link search engines. This static list (formed from data
sites 18) would provide a starting point for a dynamic
(live) search. Both search machines/machine farms 16 would
require extremely high speed access to the Internet or
other like data networks.

Data-mining is the process of discovering useful 
patterns and relationships within data. This is typically 
accomplished by training and then applying a neural 
network, or inducing and then applying a decision tree, or 
applying a genetic algorithm, etc. Once the training 
aspect of many of the techniques is performed, the result 
is the data-mining tool (e.g., a trained neural network
into which someone who knows nothing about AI can simply
input values and receive results).

Data-mining "tools" are discrete and specific.
Certain models are appropriate for certain tasks. When
explanation of a particular result is important (as in
credit approval/rejections), and the available data
supports the generation/formulation of rules, an expert or
fuzzy logic system might be appropriate. When optimization
of a particular quantity is important, a genetic algorithm
or another evolutionary algorithm might be more
appropriate. When prediction/estimation is important, the
neural network training algorithm might be used.

The Dynamic Search Engine 100 can extract/provide
useful information from publicly and freely available
databases 18. However, the present invention can do the
same with proprietary databases 18.

One embodiment of the present invention incorporates
an enhanced version of simple text matching (allowing
reduced weight for synonym and possible misspelling
matches) at the first level. Associations with profile
information provide a second metric of relevance (e.g.,
certain words and word combinations are found to correlate
with interest for people providing certain combinations of
search profile factors). The final metric is whether other
articles possessing high (normalized) relevance (using
3 levels - a recursive definition) link to the page in
question. If so, then the relevance as established by this
metric is high.
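
The first-level metric (text matching with reduced weight for synonym and possible misspelling matches) might be sketched as below. The weights, the similarity cutoff, and the small synonym table are hypothetical values chosen for illustration, and Python's standard `difflib.SequenceMatcher` stands in for whatever near-text matcher an implementation would actually use.

```python
from difflib import SequenceMatcher

# Hypothetical weights and synonym table, for illustration only.
EXACT_WEIGHT = 1.0
SYNONYM_WEIGHT = 0.6      # reduced weight for a synonym match
NEAR_WEIGHT = 0.4         # reduced weight for a possible misspelling
NEAR_CUTOFF = 0.8         # minimum similarity ratio to count as a near match
SYNONYMS = {"physician": {"doctor", "clinician"}}

def term_weight(query_term, word):
    """Return the match weight of one page word against one query term."""
    if word == query_term:
        return EXACT_WEIGHT
    if word in SYNONYMS.get(query_term, set()):
        return SYNONYM_WEIGHT
    if SequenceMatcher(None, query_term, word).ratio() >= NEAR_CUTOFF:
        return NEAR_WEIGHT
    return 0.0

def text_match_score(query_terms, page_words):
    """Sum, over the page words, the best weight against any query term."""
    return sum(
        max(term_weight(q, w) for q in query_terms)
        for w in page_words
    )

score = text_match_score(["physician"], ["physician", "doctor", "physican", "garden"])
```

Here the exact match, the synonym, and the misspelling "physican" each contribute a different weight, while "garden" contributes nothing. The profile-association and link-based metrics would be computed separately and combined with this one.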



The spidering/crawling/roboting starts from the static
index found in response to the initial query 15 of
databases 18. Data sites included in the index are scanned
and assigned relevance using the 3 factors above. Data sites
with high levels of relevance are scanned deeper (all links
are followed, as well as the links revealed on those
subsequent pages) than non-relevant pages. After a maximum
number of links have been followed, or the total relevance
of pages indexed exceeds a threshold, the search stops and
results 20 are returned to user 10, organized by a weighted
conglomeration of the 3 factors (generated by a neural
network trained upon the user profile and previous searches
and relevance results).
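
A minimal sketch of such a relevance-guided crawl follows. It is an assumption-laden illustration: an in-memory dictionary `links` stands in for fetching pages, the caller-supplied `relevance` function stands in for the weighted combination of the 3 factors, and the parameter names and default thresholds are hypothetical.

```python
from collections import deque

def dynamic_crawl(seed_urls, links, relevance, max_links=50,
                  relevance_budget=5.0, min_relevance=0.5):
    """Crawl outward from a static index, going deeper only from relevant pages.

    links:     dict mapping url -> list of outgoing urls (stands in for fetching)
    relevance: callable url -> float (stands in for the 3-factor score)
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    results = {}
    followed = 0
    total = 0.0
    # Stop after a maximum number of links, or once total relevance
    # of the pages indexed exceeds the budget.
    while queue and followed < max_links and total <= relevance_budget:
        url = queue.popleft()
        score = relevance(url)
        followed += 1
        results[url] = score
        total += score
        if score >= min_relevance:      # scan deeper only from relevant pages
            for nxt in links.get(url, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    # Return pages ordered from most to least relevant.
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)

# Toy link graph: page "b" is not relevant, so its link to "d" is never followed.
toy_links = {"a": ["b", "c"], "b": ["d"], "c": ["e"]}
toy_scores = {"a": 0.9, "b": 0.2, "c": 0.8, "d": 0.9, "e": 0.7}
ranked = dynamic_crawl(["a"], toy_links, toy_scores.get)
```

In the toy graph, page "d" is never visited even though it would score well, because the only path to it runs through a page judged not relevant. That is the trade-off of scanning deeper only from relevant pages.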

For the pre-created models, the present invention also
has a page indexing the available canned models that the
user could simply choose from. Alternatively, based upon
text entered at the dynamic search engine GUI 12, the
dynamic search engine could suggest appropriate models,
where appropriate (e.g., if the user enters "blue book", the
present invention could return, at the top of a list of
links, a link to a used car value estimator neural
network).

FIGURE 2 illustrates one embodiment of the present 
invention wherein the search tools comprise a privately 
licensed search tool 22 accessing privately held databases 
24 and publicly available database 18 accessed by search 
tools provided by YAHOO, EXCITE, LYCOS and other search
tools known to those skilled in the art. 

FIGURE 3 provides an overall description of three 
processes which occur within Figures 1 and 2. Process 30 
illustrates the dynamic search engine application which 
performs the function of mining search profile data as 
provided from user 10 via GUI 12. Mining or cross 
referencing the search profile data against subject 
information includes the dynamic search capabilities of 
evaluating data sites 18. Process 32 in Figure 1 
illustrates the interaction between a user 10, the dynamic
search engine and an available search tool 16 that
accesses individual web sites 18. Each individual search
tool 16 may be customized to the protocols associated
with each search engine. Process 34 illustrates the
process between a user 10, a dynamic search engine of the
present invention and a proprietary search engine when the
search tool 16 is a proprietary search engine accessing
proprietary databases.

The improvements to previously existing artificial
neural network training methods and systems mentioned in
the various embodiments of this invention can occur in
conjunction with one another (sometimes even to address the
same problem). FIGURE 4 demonstrates one way in which the
various embodiments of an improved method for training an
artificial neural network (ANN) can be implemented and
scheduled. FIGURE 4 does not demonstrate how
representative dataset selection is accomplished, but
instead starts at train net block 101 with the
representative training dataset already selected.

The training dataset at block 101 can consist
initially of one kind of pattern that is randomly selected,
depending on whether or not clustering is used. Where
clustering takes place, it takes place prior to any other
data selection. Assuming, as an example, that clustering
has been employed to select twenty training patterns, the
ANN can then be randomly initialized, all the parameters
randomly initialized around zero, and the ANN can take those
20 data patterns and for each one calculate the gradient and
multiply the gradient by the initial value of the learning
rate. The adaptive learning rate is user-definable, but is
usually initially set around unity (1). For each of the
representative data patterns initially selected, the
training algorithm of this invention calculates the
incremental weight step, and after it has been presented
all twenty of the data patterns, it will take the sum of
all those weight steps. All of the above occurs at train
net block 101.

From train net block 101, the training algorithm of
this invention goes to block 102 and determines whether the
training algorithm is stuck. Being stuck means that the
training algorithm took too large a step and the prediction
error increased. Once the training algorithm determines
that it is stuck, at block 104 it decreases the adaptive
learning rate by multiplying it by a user-specified value.
A typical value is 0.8, which decreases the learning rate
by 20%.

If the training algorithm reaches block 102 and 
determines there has been a decrease in the prediction 
error (i.e., it is not stuck), the training algorithm 
proceeds to block 108 and increases the learning rate. The 
training algorithm returns to block 101 from block 108 to 
continue training the ANN with a now increased adaptive 
learning rate. 

The training algorithm proceeds to block 106 after
decreasing the adaptive learning rate in block 104 and
determines whether it has become "really stuck." "Really
stuck" means that the adaptive learning rate decreased to
some absurdly small value on the order of 10^-6. Such a
reduction in the adaptive learning rate can come about as a
result of the training algorithm landing in a local minimum
in the error surface. The adaptive learning rate will
normally attempt to wiggle through whatever fine details
are on the error surface to come to a smaller error point.
However, in the natural concavity or flat spot of a local
minimum there is no such finer detail that the training
algorithm can wiggle down to. In such a case the adaptive
learning rate decreases to an absurdly low number.

If, at block 106, the training algorithm determines
that it is really stuck (i.e., that the learning rate has
iteratively decreased to an absurdly small value), it
proceeds to block 110 and resets the adaptive learning rate
to its default initial value. In the event that the
training algorithm is not really stuck at block 106, it
returns to block 101, recalculates the weight steps, and
continues training with newly-modified weights. The
training algorithm continues through the flow diagram, as
discussed above and below.
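
The learning-rate schedule of blocks 102 through 110 can be sketched as a single update function. This is not BLACK's implementation: the function name, the increase factor of 1.1 (the text gives no value for block 108), and treating an unchanged error as "stuck" are assumptions. The 0.8 decrease factor, the initial rate of 1, and the 10^-6 floor come from the description above.

```python
def update_learning_rate(rate, error, prev_error, *, initial_rate=1.0,
                         decrease=0.8, increase=1.1, floor=1e-6):
    """Apply one pass of the stuck / really-stuck logic to the learning rate.

    Returns (new_rate, state), where state is "improving", "stuck",
    or "really stuck".
    """
    if error < prev_error:                  # block 102: error decreased, not stuck
        return rate * increase, "improving"     # block 108: increase the rate
    rate *= decrease                        # block 104: decrease the rate
    if rate <= floor:                       # block 106: rate absurdly small?
        return initial_rate, "really stuck"     # block 110: reset to default
    return rate, "stuck"

# Error went up from 0.4 to 0.5, so the rate drops from 1.0 to 0.8.
new_rate, state = update_learning_rate(1.0, error=0.5, prev_error=0.4)
```

Resetting the rate when it falls to the floor, rather than letting it shrink further, is what hands control to the same-minimum check and weight jogging described next.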

Once the adaptive learning rate is reset at block 110,
the training algorithm proceeds to block 112, where it
determines whether the minimum in which it is currently
stuck is the same minimum in which it has been stuck in the
past (if it has been stuck before). This is because as the
training algorithm is learning it will sometimes get out of
a local minimum and wind up in the same minimum at a future
time. If it finds itself stuck in the same minimum, the
training algorithm checks, at block 114, whether it has
achieved a maximum on the gaussian distribution from which
a random value is chosen to perturb the weights (i.e.,
whether the maximum jog strength has been achieved). The
"maximum jog strength" is the maximum value from the
gaussian distribution. If the maximum jog strength has
been achieved, at block 116 the training algorithm resets
the jogging strength.

The jogging strength is reset at block 116 because the 
problem is not so much that the training algorithm has 
found itself in a local minimum, but that the ANN is not 
complicated enough. The training algorithm moves to block 
118 and determines whether it has, prior to this point, 
trimmed any weights. "Trimming weights" means to set those 
weights to zero and take them out of the training 
algorithm. The procedure for trimming of weights will be 
described more fully with respect to FIGURE 13 below. 

If at step 118 the training algorithm determines that 
weights have previously been trimmed (i.e., that the 
weights have been previously randomly affected but the 
training algorithm still wound up in the same minimum 
because the network was not complex enough to get any more 
accuracy out of the mapping) , the training algorithm moves 
to step 120 and untrims 5% of the weights. This means that 
weights that were previously trimmed are allowed to resume 
at their previous value, and from this point on they will 
take part in the training algorithm. The training 
algorithm returns to step 101 and continues to train as 
before.

By untrimming 5% of the weights, the training
algorithm returns a little more complexity back to the
model in hopes of decreasing the prediction error. If
prediction error does not decrease, the training algorithm
will once again reach a local minimum and the training
algorithm will determine once again at block 112 whether it
is stuck in the same minimum as before. Note, however,
that at block 110 the adaptive learning rate is reset
before addressing the complexity issue of untrimming
previously trimmed weights, so it takes some iterations
through blocks 101, 102, 104, 106 and 110 before getting
back to the process of untrimming any more weights. In the
event the training algorithm does wind up in the same
minimum, the maximum jog strength will not have been
reached, since it was previously reset at block 116 in a
prior iteration. Instead, the training algorithm will
proceed to block 136. At block 136, the weights are jogged,
and at block 140 the jogging strength is slightly increased
according to a gaussian distribution. Following block 140,
the training algorithm proceeds to train net block 101 and
continues training.

If in the course of training the training algorithm
again reaches the same minimum, the procedure above is
repeated. In the event the jog strength once again reaches
the maximum level at block 114, the training algorithm
resets the jogging strength as previously discussed. If
the training algorithm reaches block 118 after several
rounds of untrimming weights such that there are no longer
any trimmed weights, the training algorithm proceeds along
the "no" path to block 122.

At block 122, the training algorithm determines if
this is the first time it has maxed out the jog strength on
this size ANN. The training algorithm keeps a counter of
how many times the jog strength has maxed out with an ANN
of a given size. If this is the first time the jog strength
has maxed out for the current ANN size, the training
algorithm proceeds along the "yes" path to block 124 and
completely re-initializes the ANN. All of the weights are
re-initialized and the ANN is restarted from scratch. The
training algorithm proceeds to block 101 and commences
training the net anew. The ANN, however, remains whatever
size it was in terms of number of hidden layers and number
of nodes when training resumes at train net block 101 with
the newly re-initialized weights.

At block 122, if the answer is "no," the training 
algorithm proceeds along the "no" path to block 126. At 
block 126 the training algorithm has already maxed out the 
jog strength more than once for the current size ANN. 
Block 126 tests to see how many new nodes have been added 
for the current state of the representative training 
dataset. The training algorithm determines if the number 
of new nodes added for this size ANN is greater than or 
equal to five times the number of hidden layers in the ANN. 
If the number of new nodes added is not equal to or in 
excess of 5 times the number of hidden layers in the ANN, 
the training algorithm proceeds along the "no" path to 
block 128. At block 128, a new node is added according to 
the procedures discussed above and the training algorithm 
proceeds to train net block 101 to continue training the 
artificial neural network with the addition of the new 
node. The training algorithm of this invention will then 
proceed as discussed above. 

If the number of new nodes added equals or exceeds five
times the number of hidden layers, the training algorithm
proceeds along the "yes" path from block 126 to block 130.
At block 130, the training algorithm determines whether a
new layer has previously been added to the ANN. If the
training algorithm has not previously added a new layer
(since the last time it added a training data pattern), it
proceeds along the "no" path to block 132 and adds a new
layer to the artificial neural network. The training
algorithm then proceeds to block 101 and continues to train
the net with the newly added layer. If a new layer has
been added since the last training pattern was added, the
training algorithm proceeds along the "yes" path to block
134.

If a new layer has previously been added, it means
that the training algorithm has previously added a number
of nodes, has jogged the weights a number of times, and has
added a layer because of the new training data pattern that
has been added in the previous iteration. The training
algorithm decides by going to block 134 that the training
data pattern added recently is an out-lier and does not fit
in with the other patterns that the neural network
recognizes. In such a case, at block 134 the training
algorithm removes that training data pattern from the
representative training dataset and also removes it from
the larger pool of data records from which the training
algorithm is automatically selecting the training dataset.
The training algorithm once again proceeds to train net
block 101 and continues to train the network without the
deleted data pattern.
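
The escalation from re-initializing the weights, to adding nodes, to adding a layer, to discarding an out-lier pattern can be summarized as one decision function. This is a sketch of the control flow described for blocks 122 through 134 only; the function name, argument names, and returned labels are hypothetical.

```python
def next_action(times_maxed_out, nodes_added, hidden_layers,
                layer_added_since_last_pattern):
    """Choose what to do once jog strength has maxed out and no trimmed weights remain."""
    if times_maxed_out == 1:                    # block 122: first time for this ANN size
        return "reinitialize weights"           # block 124
    if nodes_added < 5 * hidden_layers:         # block 126: fewer than 5 nodes per hidden layer
        return "add node"                       # block 128
    if not layer_added_since_last_pattern:      # block 130, "no" path
        return "add layer"
    return "remove outlier pattern"             # block 134

# Second max-out on a 2-hidden-layer ANN that has gained 3 nodes so far.
action = next_action(2, nodes_added=3, hidden_layers=2,
                     layer_added_since_last_pattern=False)
```

Each step adds capacity only after the cheaper remedy has been exhausted, and the most recently added training pattern is removed only when even a new layer has not helped.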

Returning to block 112, if the training algorithm
decides that it has not fallen into the same minimum, it
proceeds along the "no" path to block 138. At block 138,
the training algorithm resets the jogging strength to give
only a small random perturbation to the weights and
parameters in an attempt to extricate itself from a new
local minimum. If the training algorithm reaches a new
local minimum, we want the training algorithm to start over
again. It is desirable to reset the jogging strength
so as to give a small random perturbation to the weights
and parameters. The intent is to start off with a small
perturbation and see if it is sufficient to extricate the
training algorithm from the new local minimum.

After resetting the jogging strength in block 138, the 
training algorithm proceeds to block 136 and jogs the
weights. The training algorithm proceeds to block 140, 
increases the jogging strength, and proceeds to block 101 
and trains the net with the newly increased jogging 
strength. 
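
Weight jogging might be sketched as below. The sketch assumes that "jog strength" is the standard deviation of a zero-mean gaussian perturbation, that it grows by a fixed factor after each jog, and that it is capped at a maximum. None of these specifics, nor the class and parameter names, are stated in the text, which says only that the perturbation is drawn from a gaussian distribution and that the strength is slightly increased.

```python
import random

class Jogger:
    """Tracks jog strength and perturbs weights with gaussian noise."""

    def __init__(self, initial=0.01, growth=1.5, maximum=1.0, seed=0):
        self.initial = initial      # assumed starting strength
        self.growth = growth        # assumed growth factor per jog
        self.maximum = maximum      # assumed maximum jog strength
        self.strength = initial
        self.rng = random.Random(seed)

    def reset(self):
        """Blocks 116 and 138: go back to a small perturbation."""
        self.strength = self.initial

    def at_maximum(self):
        """Block 114: has the maximum jog strength been reached?"""
        return self.strength >= self.maximum

    def jog(self, weights):
        """Block 136: perturb each weight; block 140: increase the strength."""
        jogged = [w + self.rng.gauss(0.0, self.strength) for w in weights]
        self.strength = min(self.strength * self.growth, self.maximum)
        return jogged

jogger = Jogger()
weights = jogger.jog([0.0, 0.0, 0.0])   # small first perturbation
```

Starting small and growing the perturbation only when the same minimum recurs keeps most of what the network has already learned while still giving it a chance to escape.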

FIGURE 4 thus gives us an overview in operation of the 
various embodiments of the training algorithm of BLACK. 

FIGURE 5 provides a flow chart of the present 
invention illustrating one method of dynamic data-mining. 

At step 202, user 10 arrives at a GUI 12 and logs on. 
Once logged in, the system queries the user for their 
specific search profile.

Once the user has entered the data, the specific 
profile is output to data-mining search engine 12 at step 
204. 

In step 206, the dynamic search engine 100 data-mines
the specific profile to determine what other related
topics of interest would be relevant and of greatest
interest to the user 10.
The information is categorized so that it can be
transferred to both existing and future search engines.

These related topics of interest are fed back to user
10 in step 208. User 10 then determines the topic outputs,
the specific and related topics to be researched. The
dynamic search engine then connects to existing public and
proprietary search tools 16.

At step 210, the information is transferred, over the
Internet or other like communication pathway, to other
sites and/or licensed search tools (Yahoo, Lycos or others
known to those skilled in the art) to find information
matching the search query 15.

At step 212, information is gathered from the search
destination site(s) pertaining to the request.

At step 214, information is sent from the search
engine (Yahoo, etc.) to the dynamic search engine.
Relevant information is gathered from the destination
databases.

The information is sent back to the data-mining search
engine 14, at which point the information is
cross-referenced to the user's profile. Depending on the
profile, the presentation will rate and weigh
each search to present the most relevant and related topics
of interest.

The information will be presented back to the user in
a way such as:

• The most relevant topics/areas of interest: #1-10
• The most related topics/areas of interest: #1-10
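
The two ranked lists above could be assembled as in the following sketch. It assumes each gathered result already carries a score from cross-referencing against the user's profile and a label marking it as "relevant" or "related". The function name and the tuple layout are hypothetical.

```python
def present(results, top_n=10):
    """Split scored results into 'relevant' and 'related' lists, best first.

    results: iterable of (topic, score, kind), kind in {"relevant", "related"}.
    """
    lists = {"relevant": [], "related": []}
    for topic, score, kind in results:
        lists[kind].append((score, topic))
    # Sort each list by descending score and keep only the top entries.
    return {
        kind: [topic for _, topic in sorted(items, reverse=True)[:top_n]]
        for kind, items in lists.items()
    }

# Twelve relevant topics and one related topic; only ten relevant ones are shown.
sample = [(f"topic {i}", float(i), "relevant") for i in range(12)]
sample.append(("neighboring topic", 0.5, "related"))
page = present(sample)
```

A request for more information would then simply continue each list from position 11 onward.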
This information will include subjects such as areas
of interest that have been shown to have a strong
correlation and/or relationship to the specific topic of
interest.

Once the user has received the information, they will 
be asked if they would like to see more information. Each 
time the user requests additional information, it will be
presented subsequent to the most recent, most relevant
information previously presented.

Over time, the profile information database will 
continue to grow and become more intelligent. Therefore, 
each subsequent search will become more intelligent and 
relevant to the previous user. This data will continue to 
collect in a profile database located within Dynamic search 
engine 14. Over time, one can monitor the searches and 
rate each search a success or failure (or some degree of 
one or the other), and then use Artificial Neural Nets, 
Genetic algorithms, or other empirical techniques to 
optimize how the search is conducted. 

The Dynamic search engine becomes an intelligent agent 
that specifically pulls back better (and more recent - also 
implying more thorough) results than the static search 
engines that require more user information. Results are 
specifically searched for with user needs expressed prior 
to the search, resulting in searches explicitly tailored 
to a user request. 

One embodiment of the present invention provides for a 
multi-component tool, with six main interacting components: 
Web servers, Highspeed Internet Connections, Web pages, 
Health-related Databases, Database Query and Response 
Scripts/Code, and the Dynamic Internet Search Scripts/Code. 
The web servers are the computer equipment, operating 
systems, and communications software that will contain and 
execute the web pages (GUI) 12 and Dynamic search engine 
14. The equipment may also contain the databases, provide 
highspeed Internet connections, and perform the database 
[...]. This equipment may be configured [...] peripherals, 
and software. 

Initially, only one system must [...]. As use grows, a 
search response [...] enable (and a scalability strategy 
developed). This [...] projection of the number of servers 
necessary per user. Estimates may be arrived at from data 
provided by similar web service companies. 

The communication pathways, Highspeed Internet 
connections, consist of T1s, T3s, or other connections 
known to those skilled in the art. Those connections 
provide wide-bandwidth communication to and from the entire 
Internet, and any associated equipment which is not 
considered a part of the web server. As with the web 
servers, the amount of necessary bandwidth will be a 
function of the number of concurrent users. 

Web pages (GUI) 12 present search prompts and results 
via the Internet to user 10 and define the interface of the 
system of the present invention to the user. 

The web pages define the format of the query pages and 
search result pages. The query pages must have multiple 
forms/options to allow flexibility in searching (which 
databases to query, simple/Boolean forms, whether to search 
the Internet, how deep/long to search the Internet, etc.). 

The search result pages will take multiple forms, depending 
on the specified request, but will include relevance 
scores, titles and links, and summaries, much as resulting 
from Internet search engine requests. For Internet search 
results, links would lead to web pages. For other database 
results, the links would lead to graphical/textual reports 
for each "hit." 
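
The query options and result fields listed above can be pictured as two small records. This is only an illustrative sketch: the class names `QueryOptions` and `SearchResult`, their field names, and the default values are assumptions, not part of the described system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryOptions:
    """Choices a query page might offer (all names are hypothetical)."""
    text: str                                           # the text-based query
    databases: List[str] = field(default_factory=list)  # which databases to query
    boolean_form: bool = False                          # simple vs. Boolean form
    search_internet: bool = True                        # whether to search the Internet
    max_depth: int = 2                                  # how deep to search the Internet


@dataclass
class SearchResult:
    """One entry on a search result page."""
    relevance: float  # relevance score
    title: str
    link: str         # web page, or a report for a database "hit"
    summary: str
```

Keeping the options in one record makes it straightforward to add further forms later without changing how a query is passed between components.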

The present invention may utilize databases containing 
licensed and public domain data. This component includes only 
bare-data and "pre-processing" thereof. Data-mining (e.g., 
a hypothetical diagnostic tool "what illness you probably 
have" based upon a neural network trained from a 
symptom/illness database) and analysis are considered part 
of the following component and its development. 

The database query scripts direct the simple searching 
and querying of the databases, access custom data-mining 
solutions developed for some of the databases, and allow 
visualization for exploration of the databases. These 
scripts are also responsible for returning the results of 
searches in the designed HTML format. 
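
A minimal sketch of how such a script might return results as HTML is given below. It assumes each result is a `(relevance, title, link, summary)` tuple and that an ordered list is an acceptable layout; the `render_results` function and the markup are illustrative assumptions, since the actual page design is not specified.

```python
from html import escape


def render_results(results):
    """Render (relevance, title, link, summary) tuples as an HTML list, best first."""
    items = []
    for relevance, title, link, summary in sorted(
        results, key=lambda r: r[0], reverse=True
    ):
        # Escape every field so text taken from a database cannot break the markup.
        items.append(
            '<li><a href="{}">{}</a> (relevance {:.2f})<br>{}</li>'.format(
                escape(link, quote=True), escape(title), relevance, escape(summary)
            )
        )
    return "<ol>\n" + "\n".join(items) + "\n</ol>"
```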

Each data-mining tool to be implemented may be custom 
developed for the appropriate database. Such tools will 
continue to be added, as appropriate data becomes available 
to the present invention, even after deployment of the 
system. 

The dynamic Internet search scripts, based upon the 
text-based query, and possibly a demographic and historical 
search profile, perform a "blind" and "dynamic" search of 
world wide web pages, returning those deemed most 
"relevant." This search is blind, in that prior to the 
search, no index (such as those compiled and used by 
existing search engines) has been generated. This search 
will be dynamic, in that, contrary to the manner in which 
other search engines return their results (based upon a 
pre-compiled though continuously updated index), the web is 
searched anew with each request. 

Based upon the top N (adjustable by the user) results 
returned by the static search, the dynamic search would 
assign a relevance to each page. The dynamic search would 
then proceed to "spider" to each of the links contained in 
each page, according to a function of the relevance. The 
search would spider several levels beyond extremely 
relevant pages, and none beyond irrelevant pages. As 
listed below, initially the relevance function would 
consist of simple text matching and counting of keyword 
occurrences (as do the other search engines). 

Based upon a historical profile of search successes 
and failures, as well as technologies from artificial 
intelligence and other fields, the relevance rating 
function will be optimized. The more the tool is used 
(especially by a particular user), the better it will 
function at obtaining the desired information earlier in a 
search. The user will not have to be a computer or 
information scientist. The user will just be aware that, 
with the same input the user might give a static search 
engine, the present invention finds more relevant, more 
recent and more thorough results than any other search 
engines. 
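
The relevance-guided spidering described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the web is modeled as an in-memory dictionary instead of live HTTP fetches, relevance is a plain keyword count, and the thresholds in `levels_beyond` (one level for mildly relevant pages, three for highly relevant ones) are arbitrary choices. All function names are hypothetical.

```python
from collections import deque


def relevance(text, keywords):
    """Count keyword occurrences in the page text (simple text matching)."""
    words = text.lower().split()
    return sum(words.count(k.lower()) for k in keywords)


def levels_beyond(score):
    """Assumed mapping from relevance to how many link levels to follow."""
    if score == 0:
        return 0
    return 1 if score < 3 else 3


def dynamic_search(pages, seeds, keywords):
    """Spider outward from the seed pages, going deeper past relevant pages.

    pages: {url: (text, [linked urls])}, a stand-in for fetching live pages.
    seeds: the top-N urls returned by a static search.
    Returns (url, score) pairs sorted by score, highest first.
    """
    scores = {}
    queue = deque((url, 0) for url in seeds)
    while queue:
        url, inherited = queue.popleft()
        if url in scores or url not in pages:
            continue  # each page is scored once, on first arrival
        text, links = pages[url]
        score = relevance(text, keywords)
        scores[url] = score
        # Never go beyond an irrelevant page; otherwise keep the larger of the
        # budget inherited from the parent and the page's own budget.
        remaining = 0 if score == 0 else max(inherited - 1, levels_beyond(score))
        if remaining > 0:
            queue.extend((link, remaining) for link in links)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Replacing `relevance` with a learned scoring function would be one way to model the later optimization step while leaving the traversal unchanged.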

A method and system for dynamically searching 
databases in response to a query is provided by the present 
invention. More specifically, the invention provides a 
system and method for dynamic data-mining and on-line 
communication of customized information. This method 
includes the steps of first creating a search-specific 
profile. This search-specific profile is then input into a 
data-mining search engine. The data-mining search engine 
will mine the search-specific profile to determine topics 
of interest. These topics of interest are output to at 
least one search tool. These search tools match the topics 
of interest to at least one destination data site, wherein 
the destination data sites are evaluated to determine if 
relevant information is present in the destination data 
site. Relevant information is filtered and presented to 
the user making the inquiry. 
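
The sequence of steps summarized above can be pictured as a small pipeline. The sketch below is hypothetical throughout: the profile is a plain dictionary, `mine_topics` simply combines the query with interests stored in the profile, each search tool is a callable returning `(site, text)` pairs, and evaluation is a count of topic mentions. None of these choices is taken from the disclosure.

```python
def mine_topics(profile):
    """Hypothetical miner: the query term plus interests recorded in the profile."""
    topics = [profile["query"]]
    topics.extend(t for t in profile.get("interests", []) if t not in topics)
    return topics


def run_search(profile, search_tools, min_score=1):
    """Mine topics, query each tool, evaluate the sites, and filter the results."""
    topics = mine_topics(profile)
    best = {}
    for tool in search_tools:
        for topic in topics:
            for site, text in tool(topic):  # each tool returns (site, text) pairs
                # Evaluate the destination site by counting topic mentions.
                score = sum(text.lower().count(t.lower()) for t in topics)
                best[site] = max(best.get(site, 0), score)
    # Filter out weak matches and present the rest, most relevant first.
    kept = [(score, site) for site, score in best.items() if score >= min_score]
    return [site for score, site in sorted(kept, key=lambda p: (-p[0], p[1]))]
```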

The present invention provides an advantage by 
providing a search engine algorithm that provides fresh (as 
opposed to stale) links to more highly relevant web pages 
(data sites) than provided by the current search engines. 

Although the present invention has been described in 
detail, it should be understood that various changes, 
substitutions and alterations can be made hereto without 
departing from the spirit and scope of the invention as 
described by the appended claims. 
WHAT IS CLAIMED IS: 

1. A method of dynamically searching databases in 
response to a query, comprising the steps of: 

profiling a user to create a user-specific profile; 

inputting said user-specific profile to a data-mining 
search engine; 

mining said user-specific profile to determine at 
least one topic of interest; 

outputting said at least one topic of interest to at 
least one search tool; 

using said at least one search tool to match said at 
least one topic of interest to at least one destination 
data site; 

evaluating said at least one destination data site for 
relevant information; and 

presenting said relevant information to said user. 

2. The method of Claim 1, wherein said at least one 
topic of interest further comprises specific and related 
topics of interest. 

3. A dynamic search engine comprising: 

a server system; 

a software program executed on said server system 
wherein said software program is operable to provide a 
graphical user interface to a user in which a search query 
may be received; 

a data-mining engine operable to receive said search 
query; 

at least one search tool coupled to said data-mining 
engine operable to execute said search query and receive a 
response; and 

a filtering system to evaluate said response and pass 
relevant response data from said response to said user. 